In the evolutionary version of the minority game, agents update their strategies (gene value $p$) in
order to improve their performance. Motivated by recent intriguing results obtained for prize-to-fine
ratios which are smaller than unity, we explore the system's dynamics with a strategy updating rule
of the form $p \to p \pm \delta p$ ($0 \le p \le 1$). We find that the strategy distribution depends strongly
on the values of the prize-to-fine ratio $R$, the length scale $\delta p$, and the type of boundary condition
used. We show that these parameters determine the amplitude and frequency of the temporal
oscillations observed in the gene space. These regular oscillations are shown to be the main factor
which determines the strategy distribution of the population. In addition, we find that agents
characterized by $p = 1/2$ (a coin-tossing strategy) have the best chances of survival at asymptotically
long times, regardless of the value of $\delta p$ and the boundary conditions used.



The Minority Game (MG) is a successful model describing a population of competing and
evolving individuals. This complex system has been explored extensively in the last few
years; see, e.g., [1-21] and references therein. The present work is mainly motivated by
the recent results of [20,15].

In this toy model, a population of $N$ agents with limited information and capabilities
repeatedly compete for a limited global resource, that is, to be in the minority. The
desire to be in a minority group is found in many real-life situations, such as financial
markets, traffic jams, or a group of predators (who wish to hunt in areas with fewer
competitors).

At each round of the game, every individual has to choose whether to be in room '0'
(e.g., choosing to sell an asset) or in room '1' (e.g., choosing to buy an asset). At the
end of each turn, agents belonging to the smaller group (the minority) are the winners:
each of them gains $R$ points (the "prize"), whereas the others lose a point (the "fine").
The agents share a common look-up table, containing the outcomes of recent occurrences.
This allows the determination of a "predicted trend" in the system, which is followed by
each agent with probability $p$, known as the agent's "gene" value.

In the evolutionary formulation of the model (EMG), agents are allowed to evolve
("mutate") their strategies based on past experience. If an agent's score falls below
some value $d$, the agent mutates: its gene value is modified. In this sense, each agent
tries to learn from its past mistakes and to adjust its strategy in order to perform
better.
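
The round and mutation steps described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the simulation code used for the results reported here: the look-up table is not modelled (the predicted trend is simply passed in as a bit), the mutated gene is redrawn uniformly as a placeholder for whichever updating rule is studied, and the score of a mutating agent is reset to zero. The function name `emg_round` and its arguments are hypothetical.

```python
import random

def emg_round(genes, scores, prediction, R, d, rng):
    """Play one round in place and return the winning (minority) room, 0 or 1.

    genes[i]  : probability p that agent i follows the predicted trend
    scores[i] : accumulated score of agent i
    prediction: predicted room (0 or 1); ASSUMPTION: supplied by the caller,
                the shared look-up table is not modelled here
    """
    n = len(genes)
    # Each agent follows the prediction with probability p, else does the opposite.
    choices = [prediction if rng.random() < p else 1 - prediction for p in genes]
    n_ones = sum(choices)
    # The room with fewer agents wins (N odd, so there are no ties).
    winning_room = 1 if n_ones < n - n_ones else 0
    for i, room in enumerate(choices):
        scores[i] += R if room == winning_room else -1.0  # prize R or fine 1
        if scores[i] < d:
            genes[i] = rng.random()  # ASSUMPTION: uniform redraw as a placeholder rule
            scores[i] = 0.0          # ASSUMPTION: score is reset after a mutation
    return winning_room
```

For example, with `genes = [1.0, 1.0, 0.0]` and `prediction = 1`, the first two agents enter room 1 and the third enters room 0, so room 0 is the minority and only the third agent collects the prize.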

A remarkable conclusion deduced from the EMG [5] is that a population of competing agents
tends to self-segregate into opposing groups characterized by extreme behavior. It was
realized that in order to flourish in such situations, an agent should behave in an
extreme way ($p = 0$ or $p = 1$) [5]. On the other hand, in many real-life situations the
prize-to-fine ratio may take a variety of different values [15,13]. A different kind of
strategy may be more favorable in such situations. In recent studies it was found [15]
that an intriguing phase transition exists in the model: "confusion" and "indecisiveness"
take over when the prize-to-fine ratio falls below some critical value, in which case
agents characterized by a coin-tossing strategy ($p = 1/2$) perform better than extreme
ones. In such circumstances agents tend to cluster around $p = 1/2$ (see Fig. 1 of
Ref. [15]) rather than self-segregate into two opposing groups.

In [15] we have considered a uniform strategy updating rule in which the new strategy (of
a mutating agent) is chosen uniformly within the range $0 \le p \le 1$. Burgos, Ceva and
Perazzo [20] have recently considered the same model problem, with an updating rule of
the form $p \to p \pm \delta p$, where $\delta p < 1/2$, and found that the population tends to
form an M-shaped strategy distribution in the $R < 1$ case. In the present work we further
explore this system, and provide some new insights that extend and link the results of
[15] to those of [20].

First, we would like to stress the importance of the chosen boundary conditions in the
case of an updating rule of the form $p \to p \pm \delta p$ [5]. Figure 1 displays the
long-time averaged gene distribution $P(p)$ of the agents for two different types of
boundary conditions: periodic and reflective. One finds that for periodic boundary
conditions the population tends to cluster at intermediate gene values. The curve between
the two peaks, located at $p = \delta p$ and $p = 1 - \delta p$, is almost flat, while agents
with extreme gene values ($p \approx 0$ and $p \approx 1$) perform much worse (we shall
shortly demonstrate that the gene distribution may also have an inverse-U shape,
depending on the precise values of $R$ and $\delta p$). On the other hand, the gene
distribution is almost flat for reflective boundary conditions.
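
To make the two boundary conditions concrete, the following Python sketch shows one possible reading of the rule $p \to p \pm \delta p$. It assumes that the new gene is drawn uniformly from the interval $[p - \delta p, p + \delta p]$ and then folded back into $[0, 1]$; the text above does not spell out the draw, so this is an assumption, and the names `fold` and `mutate` are hypothetical.

```python
import random

def fold(q, boundary):
    """Map a proposed gene value q back into the interval [0, 1]."""
    if boundary == "periodic":
        return q % 1.0                 # wrap around: p = 0 and p = 1 are identified
    if boundary == "reflective":
        q = q % 2.0                    # unfold onto [0, 2) so any step size is handled
        return 2.0 - q if q > 1.0 else q   # mirror the part that overshoots 1
    raise ValueError("boundary must be 'periodic' or 'reflective'")

def mutate(p, dp, boundary, rng):
    """ASSUMPTION: the new gene is uniform in [p - dp, p + dp] before folding."""
    return fold(p + rng.uniform(-dp, dp), boundary)
```

Under this reading, an agent at $p = 0$ that steps to $-0.05$ lands at $0.95$ with periodic boundaries but at $0.05$ with reflective ones, which is the kind of difference the two curves in Fig. 1 probe.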

The mechanism responsible for this important difference is the temporal oscillations
observed in the winning probabilities of the agents [15,16].

FIG. 1. The strategy distribution $P(p)$ for periodic boundary conditions (solid line) and
reflective boundary conditions (dashed line). The results are for $N = 10001$ agents,
$R = 0.8$, $d = -4$, and $\delta p = 0.1$. Each point represents an average value over 10 runs
and 100000 time steps per run.

Figure 2 displays the time dependence of the winning probability of a $p = 0$ agent (the
winning probability of a central agent, with $p = 1/2$, is practically constant in time).
We consider three distinct cases, characterized by: (i) $\delta p = 0.1$ with periodic
boundary conditions, (ii) $\delta p = 0.1$ with reflective boundary conditions, and (iii) a
uniform updating rule. One finds smaller oscillation amplitudes and longer periods for
reflective boundary conditions than for periodic boundary conditions. This implies that,
for reflective boundary conditions, the performance of extreme agents ($p = 0$ and
$p = 1$) becomes quite similar to the performance of central agents (characterized by
$p = 1/2$), implying a flatter gene distribution for these boundary conditions. On the
other hand, for periodic boundary conditions one finds that the temporal oscillations are
much closer to those of the uniform case studied in [15,16] than they are for reflective
boundary conditions. Indeed, the ratio $P(1/2) : P(0)$ for periodic boundary conditions is
very similar to the corresponding ratio in the uniform case (compare Fig. 1 with Fig. 1
of [15]).

Figure 3 shows the strategy distribution of the population for different prize-to-fine
ratios, and with $\delta p \ll 1$. The results demonstrate the existence of a stable phase
characterized by an inverse-U shaped gene distribution. However, unlike the uniform case
[15], the critical value of $R$ which separates the inverse-U distribution from the
M-shaped one does not equal 1 (in the $N \to \infty$ limit).

In Fig. 4 we display $P(p)$ for various $\delta p$ values with periodic boundary conditions.
We find that the peaks of the strategy distribution (for prize-to-fine ratios which are
large enough to allow an M-shaped gene distribution) occur at $p = \delta p$ and its
symmetric counterpart $p = 1 - \delta p$. Regardless of the value of $\delta p$, the agents do
not self-segregate: the extreme strategies ($p = 0$ and $p = 1$) perform worst. The
strategy distribution moves smoothly into an inverse-U shape in the limit
$\delta p \to 1/2$ [15]. Figure 5 displays the same results for reflective boundary
conditions, where $\delta p = 1$ is equivalent to the uniform updating rule [15].

FIG. 2. Temporal dependence of the winning probability $\tau(p = 0)$ for three distinct
cases: (i) $\delta p = 0.1$ with periodic boundary conditions, (ii) $\delta p = 0.1$ with
reflective boundary conditions, and (iii) uniform updating rule. The results are for
$N = 10001$ agents, $R = 0.8$, and $d = -4$.

Figure 6 displays the average lifespan $\langle L(p) \rangle$ of the agents. In order to get
a better picture of the lifespan distribution, we also plot
$\langle L(p) \rangle + \sigma_L(p)$ as a function of the gene value $p$. Here $\sigma_L(p)$ is
the root-mean-square deviation of the lifespans. In this case, one finds an inverse-U
shaped distribution (with the peak occurring at $p = 1/2$). This implies that agents
characterized by $p = 1/2$ (a coin-tossing strategy) have the best chances of survival at
asymptotically long times, as predicted analytically in [17]. This important feature is
explained by the global currents in the gene space, which reduce the value of
$\sigma_L(p = 0)$ and have a negligible effect on $\sigma_L(p = 1/2)$ [16,17]. We emphasize
that these results hold true for both periodic and reflective boundary conditions.
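
As an illustration of how the two curves of Fig. 6 could be assembled from simulation output, the sketch below bins recorded lifespans by gene value and returns the mean and the root-mean-square deviation in each bin. It assumes that a lifespan is the number of rounds an agent keeps a given gene value before mutating, and that the simulation logs one `(p, lifespan)` pair per mutation; both the record format and the name `lifespan_profile` are hypothetical.

```python
from statistics import fmean, pstdev

def lifespan_profile(records, n_bins=10):
    """Return (bin centre, mean lifespan, rms deviation) for each non-empty bin.

    records: iterable of (gene value p in [0, 1], lifespan in time steps).
    """
    bins = [[] for _ in range(n_bins)]
    for p, lifespan in records:
        index = min(int(p * n_bins), n_bins - 1)  # p = 1 falls in the last bin
        bins[index].append(lifespan)
    profile = []
    for index, values in enumerate(bins):
        if values:
            centre = (index + 0.5) / n_bins
            # pstdev is the population standard deviation, i.e. the rms deviation
            profile.append((centre, fmean(values), pstdev(values)))
    return profile
```

The dashed curve of Fig. 6 would then correspond to the sum of the second and third entries of each tuple.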

The efficiency of the system is defined as the number of agents in the minority room,
divided by the maximal possible size of the minority group, $(N - 1)/2$. Figure 7 displays
the efficiency as a function of the length scale $\delta p$. The system's efficiency is a
monotonically decreasing function of $\delta p$. This is because larger $\delta p$ values imply
larger temporal oscillations in the occupation numbers of the rooms, thus decreasing the
number of agents in the winning group (and increasing the number of agents in the losing
room).
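
The definition of the efficiency translates directly into code. The short sketch below computes it for a single round from the occupation number of room '1'; the function and argument names are hypothetical, and $N$ is assumed odd, as in the simulations reported here.

```python
def efficiency(n_in_room_1, n_agents):
    """Minority size divided by its maximal possible value (N - 1) / 2 (N odd)."""
    minority = min(n_in_room_1, n_agents - n_in_room_1)
    return minority / ((n_agents - 1) / 2)
```

With $N = 10001$, a 5000 : 5001 split gives an efficiency of 1, while a 4000 : 6001 split gives 0.8. In a simulation this quantity would be averaged over many rounds.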

Finally, we would like to address the last point raised in [20]. It is claimed there that
the fluctuations in the average gene value $\langle p \rangle$ have been considered in [14].
However, the oscillatory behavior of $\langle p \rangle$, which is a highly important feature
of the system's dynamics, was not observed in [14]. Rather, Burgos et al. [14] find a
non-oscillatory value for $\langle p \rangle - 1/2$, see Eq. (15) of [14]. We have shown, on
the other hand, that the quantity $\langle p \rangle - 1/2$ displays temporal oscillations
with well-defined frequency and amplitude [15,16]. It is important to distinguish between
the regular temporal oscillations of physical quantities (such as $\langle p \rangle$)
discussed in [15,16] and the thermal fluctuations discussed in [14]. Thermal fluctuations
of a thermodynamic system are essentially random in nature, whereas we have found regular
oscillations, which are characterized by well-defined frequency and amplitude. The
oscillatory nature of $\langle p \rangle$ [15,16] has been proven to be an essential feature
which is responsible for the dynamical phase transition (from self-segregation to
clustering) observed in the EMG [17]. We would like to emphasize that these oscillations
exist also for complex systems with a strategy updating rule of the form
$p \to p \pm \delta p$, regardless of the value of $\delta p$ and the type of boundary
conditions used (see Fig. 2).

FIG. 3. The strategy distribution $P(p)$ (plotted against the gene value $p$) for different
values of the prize-to-fine ratio: $R = 0.1$ and $R = 0.5$. The results are for
$N = 10001$ agents, $d = -4$, $\delta p = 0.1$, and periodic boundary conditions. Each point
represents an average value over 10 runs and 100000 time steps per run.

FIG. 4. The strategy distribution $P(p)$ for different $\delta p$ values: $\delta p = 0.1$, 0.25,
and 0.4. The results are for $N = 10001$ agents, $R = 0.9$, $d = -4$, and periodic boundary
conditions. Each point represents an average value over 10 runs and 100000 time steps per
run.

FIG. 5. The strategy distribution $P(p)$ for different $\delta p$ values: $\delta p = 0.1$, 0.25,
and 0.4. The results are for $N = 10001$ agents, $R = 0.9$, $d = -4$, and reflective
boundary conditions. Each point represents an average value over 10 runs and 100000 time
steps per run.

FIG. 6. The average lifespan $\langle L(p) \rangle$ (solid curve) and
$\langle L(p) \rangle + \sigma_L(p)$ (dashed curve) of the agents. The results are for
$N = 10001$ agents, $R = 0.8$, $d = -4$, $\delta p = 0.1$, and periodic boundary conditions.
Each point represents an average value over 10 runs and 100000 time steps per run.

FIG. 7. The efficiency of the system as a function of the length scale $\delta p$. The
horizontal line represents the efficiency for a coin-tossing situation. The results are
for $N = 10001$ agents, $R = 0.7$, $d = -4$, and reflective boundary conditions.